16 research outputs found

    Towards generating and evaluating iconographic image captions of artworks

    Automatically generating accurate and meaningful textual descriptions of images is an ongoing research challenge. Recently, considerable progress has been made by adopting multimodal deep learning approaches that integrate vision and language. However, image captioning models are most commonly developed using datasets of natural images, and few contributions have been made in the domain of artwork images. One of the main reasons is the lack of large-scale art datasets with adequate image-text pairs. Another is that generating accurate descriptions of artwork images is particularly challenging, because descriptions of artworks are more complex and can include multiple levels of interpretation. For the same reason, it is especially difficult to evaluate generated captions of artwork images effectively. The aim of this work is to address some of those challenges by utilizing a large-scale dataset of artwork images annotated with concepts from the Iconclass classification system. Using this dataset, a captioning model is developed by fine-tuning a transformer-based vision-language pretrained model. Due to the complex relations between image and text pairs in the domain of artwork images, the generated captions are evaluated using several quantitative and qualitative approaches. Performance is assessed using standard image captioning metrics and a recently introduced reference-free metric. The quality of the generated captions and the model’s capacity to generalize to new data are explored by applying the model to another art dataset and comparing the commonly generated captions with the genre of the artworks. The overall results suggest that the model can generate meaningful captions that are more relevant to the art historical context, particularly in comparison to captions obtained from models trained only on natural image datasets.

    Computational detection of stylistic properties of paintings based on high-level image feature analysis

    The availability of large collections of digitized paintings has opened up new research approaches to the analysis of visual art, based on the development and application of computer vision and machine learning methods. The research aim of this doctoral thesis is to develop methods for the computational detection and analysis of stylistic properties of paintings. The development of these methods is modelled on the art-historical analysis of artworks, which comprises three levels of consideration: categorization, formal analysis and perceptual analysis. The methods at all levels are based on deep convolutional neural networks. The first level corresponds to the problem of automatic image classification. A comparative analysis of different network training settings was carried out, and state-of-the-art classification accuracy was achieved for most of the presented painting classification tasks. The second level is realized through a method for quantifying the presence of specific stylistic properties and predicting the values of those properties, based on training convolutional neural network regression models. The third level is realized through a method for quantifying subjective aspects of the aesthetic, affective and memory-related perception of an artwork. This thesis presents the first comparative analysis of the predicted values of these perceptual properties, obtained by applying convolutional neural networks to a large set of paintings. The quantitative and qualitative results obtained with the presented second- and third-level methods are consistent with art-historical knowledge, as well as with the results of studies of human judgments of the presence of particular properties in an image.
    The increasing availability of large digitized fine art collections opens new research perspectives at the intersection of artificial intelligence and art history. The main research objective of this thesis is the development of methods for computational detection and analysis of stylistic properties of paintings. Motivated by the successful performance of Convolutional Neural Networks (CNNs) on a wide variety of computer vision tasks, this thesis explores the use of CNNs for learning features that are relevant for understanding stylistic properties of paintings. The proposed approach addresses three levels of analysing paintings: categorization, formal analysis and perceptual analysis. The first level of analysis corresponds to the task of automated image classification. Different CNN fine-tuning strategies are explored and state-of-the-art results are achieved for several art-related classification tasks. The second level, formal analysis, includes training CNN regression models to predict values of features that quantify specific stylistic properties relevant for art history. The third level, perceptual analysis, involves quantitative approaches to highly subjective aspects of perceiving artworks. CNN models are employed to predict scores related to three subjective aspects of human perception: aesthetic evaluation of the image, sentiment evoked by the image and memorability of the image. The presented approach enables new ways of exploring fine art collections based on highly subjective aspects of art, and represents a step towards bridging the gap between traditional formal analysis and computational analysis of fine art.
